# CLIP Visual Encoding
**Clip Vit Base Patch32 Stanford Cars** (tanganke) — downloads: 4,143 · likes: 1
A visual classification model based on the CLIP Vision Transformer architecture, fine-tuned on the Stanford Cars dataset.
Task: Image Classification · Library: Transformers
**Taiyi CLIP Roberta 102M Chinese** (IDEA-CCNL) — license: Apache-2.0 · downloads: 558 · likes: 51
The first open-source Chinese CLIP model, pre-trained on 123 million image–text pairs, with a text encoder based on the RoBERTa-base architecture.
Task: Text-to-Image · Library: Transformers · Language: Chinese
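Both models above inherit CLIP's core mechanism: an image embedding and candidate text embeddings are L2-normalized, compared by temperature-scaled cosine similarity, and the similarities are softmaxed into match probabilities. A minimal NumPy sketch of that scoring step (the function name, embedding dimension, and temperature value are illustrative assumptions, not taken from either model card):

```python
import numpy as np

def clip_similarity(image_emb, text_embs, temperature=0.07):
    # L2-normalize embeddings, as CLIP does before the dot product
    img = image_emb / np.linalg.norm(image_emb)
    txt = text_embs / np.linalg.norm(text_embs, axis=1, keepdims=True)
    # Temperature-scaled cosine similarities between the image and each caption
    logits = img @ txt.T / temperature
    # Numerically stable softmax over the candidate texts
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

rng = np.random.default_rng(0)
image_emb = rng.normal(size=512)          # one image embedding (dim assumed)
text_embs = rng.normal(size=(3, 512))     # e.g. three candidate captions
probs = clip_similarity(image_emb, text_embs)
```

Zero-shot classification, as in the Stanford Cars model, is this same computation with one caption per class label.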